Install gate, Phase 0: vuln-api contract + test harness#110

Open
juangaitanv wants to merge 8 commits into main from install-gate-phase-0

@juangaitanv juangaitanv commented Jun 12, 2026

Contributor

Overview

This PR starts the install-gate feature for the Corgea CLI.

The install gate will eventually check npm and PyPI package versions against Corgea's vulnerability API before install flows proceed, so vulnerable or malicious versions can be blocked before they reach the developer's environment.

This is the first PR in a stacked restart of the install-gate work. The previous attempt built the right shape, but it was too large to review as one change. This restart lands one phase per PR, each with explicit exit criteria.

This PR is Phase 0: foundation only. No user-facing command ships here, and no package install is blocked yet.

What Phase 0 Includes

Phase 0 adds the client, contract, and test scaffold that later install-gate phases will build on:

  • src/vuln_api/ - blocking client for GET /v1/packages/{eco}/{name}/versions/{ver}/check. It is independent of the shared CLI HTTP client: the vuln-api host is user-configurable via CORGEA_VULN_API_URL, so the client must not send Corgea cookies to it or follow redirects. It includes status mapping, a confused-deputy identity guard, request-time PyPI name normalization, and fixed request headers.
  • src/vuln_api_stub/ - minimal in-process TCP stub for tests, gated out of release builds behind the test-stub cargo feature. There is no standalone binary.
  • tests/common/ - GateHarness, the shared integration-test scaffold for later phases. It provides an isolated corgea, a private PATH of fake package managers, registry stubs, and vuln-api stubs.
  • tests/fixtures/vuln_api/ - committed contract response bodies for clean, vulnerable, malware, and unknown package checks.
  • tests/vuln_api_contract.rs - contract tests against the hermetic stub plus ignored staging-worker tests for the live staging endpoint.
  • tests/harness_smoke.rs - smoke coverage proving the harness wires the fake package manager, registry stub, and vuln-api stub.
  • .github/workflows/staging-contract.yml - scheduled non-blocking staging contract checks so endpoint, schema, or seed-data drift is caught out of band.
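
To make the endpoint shape concrete, here is a minimal sketch of how a base URL and check path could be assembled. The helper names `resolve_base` and `check_url` are hypothetical and not taken from this PR; only the path template, the `CORGEA_VULN_API_URL` variable, and `DEFAULT_VULN_API_URL` come from the description. The sketch assumes the name and version are already percent-encoded and that the ecosystem segment is a plain string such as `npm`.

```rust
/// Default endpoint named in the review notes (currently the staging worker).
const DEFAULT_VULN_API_URL: &str = "https://cve-worker-staging.corgea.workers.dev";

/// Pick the base URL: a non-empty configured value wins, otherwise the default.
/// Trailing slashes are dropped so joining the path stays predictable.
/// Hypothetical helper; the real client's validation may be stricter.
fn resolve_base(configured: Option<&str>) -> String {
    let raw = configured
        .map(str::trim)
        .filter(|value| !value.is_empty())
        .unwrap_or(DEFAULT_VULN_API_URL);
    raw.trim_end_matches('/').to_string()
}

/// Build the check URL from the path template in the PR description.
/// Assumes `name` and `ver` are already percent-encoded by the caller.
fn check_url(base: &str, eco: &str, name: &str, ver: &str) -> String {
    format!("{base}/v1/packages/{eco}/{name}/versions/{ver}/check")
}
```

A caller would typically pass `std::env::var("CORGEA_VULN_API_URL").ok().as_deref()` as the configured value; keeping the environment read outside the helper makes it easy to unit-test.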

Deliberately Out Of Scope

Later phases will add:

  • user-facing install-gate commands
  • auth/token handling for protected vuln-api access
  • retries
  • package-manager interception
  • install blocking behavior

Exit Criteria - Met

Contract tests are green against both the stub and staging, and a later phase can write an integration test with under 20 lines of setup.

  • Hermetic contract tests + 118 in-crate unit tests pass via cargo test.
  • Staging contract tests pass against the live worker:
    cargo test --test vuln_api_contract -- --ignored
    # 5 passed (axios@0.21.0, minimist@0.0.8, node-fetch@2.6.0, mezzanine==6.0.0, unknown)
    
  • ./harness check passes.

Also Included

A one-line prerequisite fix (0e5beb4): tests/cli_deps.rs's git helper now scrubs inherited GIT_* env. Running the suite from the pre-commit hook in a worktree leaked GIT_DIR into the tests' subprocesses, pointing their git init at the developer's repo instead of the temp dir. Required for the harness to pass cleanly under the commit hook.
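
The idea behind the fix can be sketched as follows. This is an illustration under assumptions, not the code in tests/cli_deps.rs: the function names `git_vars_to_scrub` and `scrubbed_git` are hypothetical, and the real helper may scrub a narrower set of variables.

```rust
use std::ffi::OsString;
use std::path::Path;
use std::process::Command;

/// Return the names of variables whose key starts with `GIT_`.
/// Takes an iterator so it can be tested without touching the real environment.
fn git_vars_to_scrub<I>(vars: I) -> Vec<OsString>
where
    I: IntoIterator<Item = (OsString, OsString)>,
{
    vars.into_iter()
        .map(|(key, _value)| key)
        .filter(|key| key.to_string_lossy().starts_with("GIT_"))
        .collect()
}

/// Build a `git` command rooted at `dir` with every inherited `GIT_*`
/// variable removed, so a `GIT_DIR` set by a hook cannot redirect it.
fn scrubbed_git(dir: &Path) -> Command {
    let mut cmd = Command::new("git");
    cmd.current_dir(dir);
    for key in git_vars_to_scrub(std::env::vars_os()) {
        cmd.env_remove(key);
    }
    cmd
}
```

With this shape, a test would call something like `scrubbed_git(temp_dir).arg("init").status()` and the child process would resolve its repository from the temp dir rather than from the hook's environment.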

Review Notes

  • The staging worker (https://cve-worker-staging.corgea.workers.dev) is the current default endpoint (DEFAULT_VULN_API_URL); the production-worker handoff and seed-data ownership are open questions for later phases, not blockers for Phase 0.
  • Stub key normalization mirrors the client's, so a fixture keyed under an alternate PyPI spelling still serves a request for the canonical name.

Running the suite from a git hook (e.g. pre-commit in a worktree) leaks
GIT_DIR into the tests' subprocesses, pointing their git init at the
developer's repo (locked mid-commit) instead of the temp dir.

The vuln-api client and its versioned contract (clean / vulnerable /
malware / unknown verdicts, remediation data), harvested from the
install-vuln-gate spike (dfac68e) and trimmed to phase scope: public
unauthenticated lookups only, no retries, no user-facing command.

- src/vuln_api: blocking client for /v1/packages/.../check with status
  mapping, identity guard, and PEP 503 request-time normalization
- src/vuln_api_stub: in-process TCP stub, gated out of release builds
  via the test-stub feature + self dev-dependency
- tests/common: shared GateHarness scaffold for later phases
- tests/vuln_api_contract.rs: contract tests against the stub (hermetic)
  and the staging worker (#[ignore], deterministic targets documented in
  tests/fixtures/vuln_api/README.md)
Comment thread src/vuln_api/mod.rs Outdated
Comment thread tests/common/mod.rs
Comment thread tests/vuln_api_contract.rs
…, staging CI

Addresses Cursor review on #110.

- vuln_api identity guard now applies the ecosystem's canonical-name rule
  to the response package_name before comparing, not just
  eq_ignore_ascii_case. A response echoing the stored spelling
  (`flask_cors` for a `flask-cors` request — PEP 503-equivalent) no longer
  trips the guard and fails the gate closed for valid pypi packages with
  `_`/`.` in their names. New unit test covers it.
- tests/harness_smoke.rs exercises the GateHarness scaffold directly (fake
  package manager on PATH, registry stub, vuln-api stub) so the wiring
  can't silently regress before a later phase drives it end-to-end.
- .github/workflows/staging-contract.yml runs the #[ignore]d staging
  contract tests on a daily schedule (non-blocking) so endpoint/schema/seed
  drift is caught out-of-band instead of shipping undetected.
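
As an illustration of the identity-guard change described in the first bullet above, the comparison could look roughly like this. It is a sketch assuming PEP 503 normalization (lowercase, with runs of `-`, `_`, and `.` collapsed to a single `-`); the function names are hypothetical, and the real guard also covers npm, which is omitted here.

```rust
/// PEP 503 normalization: lowercase, with each run of `-`, `_`, or `.`
/// collapsed to a single `-`.
fn pep503_normalize(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut in_separator_run = false;
    for ch in name.chars() {
        if matches!(ch, '-' | '_' | '.') {
            // Emit one `-` for the whole run of separators.
            if !in_separator_run {
                out.push('-');
            }
            in_separator_run = true;
        } else {
            out.push(ch.to_ascii_lowercase());
            in_separator_run = false;
        }
    }
    out
}

/// Hypothetical PyPI identity guard: accept the response only if the echoed
/// package name is PEP 503-equivalent to the requested one.
fn pypi_identity_matches(requested: &str, echoed: &str) -> bool {
    pep503_normalize(requested) == pep503_normalize(echoed)
}
```

A plain `eq_ignore_ascii_case` comparison would reject `flask_cors` for a `flask-cors` request, whereas comparing normalized forms accepts it and still rejects a genuinely different package.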
cargo test runs the module's tests on parallel threads. Five of them bind
ephemeral ports, and between port_is_available_reflects_current_port_usage's
drop(listener) and its re-check, a concurrent :0 bind could be handed the
just-freed port, flipping the second assert to false.

Add a module-level PORT_TEST_LOCK that all five port-binding tests acquire.
The async test scopes the guard to the synchronous reserve so no lock is held
across .await (keeps clippy::await_holding_lock clean).
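
A rough sketch of the locking pattern follows. Only the name `PORT_TEST_LOCK` comes from the commit message; `lock_ports`, `reserve_ephemeral_port`, and the body of `port_is_available` are assumptions made for illustration.

```rust
use std::net::TcpListener;
use std::sync::{Mutex, MutexGuard};

/// Serializes every test that binds an ephemeral port.
static PORT_TEST_LOCK: Mutex<()> = Mutex::new(());

/// Acquire the lock, recovering from poisoning so one panicking test does
/// not cascade into failures in the others.
fn lock_ports() -> MutexGuard<'static, ()> {
    PORT_TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Assumed shape of the helper under test: a port counts as available if a
/// listener can be bound to it on loopback.
fn port_is_available(port: u16) -> bool {
    TcpListener::bind(("127.0.0.1", port)).is_ok()
}

/// Synchronous reserve step. An async test would call this and let the guard
/// drop on return, so no lock is held across a later `.await`.
fn reserve_ephemeral_port() -> u16 {
    let _guard = lock_ports();
    let listener = TcpListener::bind(("127.0.0.1", 0)).expect("bind ephemeral port");
    listener.local_addr().expect("local addr").port()
}
```

Holding the guard from the first bind through the re-check means no other port-binding test in the module can be handed the just-freed port in between. The lock is not reentrant, so a test that already holds the guard must drop it before calling `reserve_ephemeral_port`.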
…riant docs, harness opt-out

- pypi wire names now use the server's rule (lowercase + trim,
  worker.js normalizePackageName), NOT PEP 503: collapsing
  zope.interface to zope-interface missed the stored advisory row and
  read vulnerable dotted/underscored packages as clean. PEP 503 remains
  the identity-comparison rule (Ecosystem::request_name vs
  normalize_name), and the stub's key() now mirrors the server so the
  divergence can't be masked in tests again.
- Module/auth docs no longer claim lookups are permanently
  unauthenticated: production /check requires a Corgea token (staging
  runs VULN_API_REQUIRE_AUTH=false); token wiring lands with
  authenticated mode. Test renamed public_check_sends_no_auth_headers.
- GateHarness::without_vuln_api() opt-out for no-endpoint tests.
- utils/api.rs get_source() delegates to the cached vuln_api::source().
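
The wire-name problem in the first bullet can be reproduced with a small model. This is a sketch, assuming the server rule is exactly "lowercase + trim" as stated above; `AdvisoryRows` is a hypothetical stand-in for the advisory store (or the stub's keyed fixtures), not a type from this PR.

```rust
use std::collections::HashMap;

/// Wire-name rule described in the commit: lowercase and trim, nothing else.
/// Dots and underscores are preserved.
fn pypi_wire_name(name: &str) -> String {
    name.trim().to_lowercase()
}

/// Hypothetical stand-in for the advisory rows, keyed by wire name.
struct AdvisoryRows {
    rows: HashMap<String, bool>,
}

impl AdvisoryRows {
    fn new() -> Self {
        Self { rows: HashMap::new() }
    }

    /// Store a row under the server's key for `stored_name`.
    fn insert(&mut self, stored_name: &str, vulnerable: bool) {
        self.rows.insert(pypi_wire_name(stored_name), vulnerable);
    }

    /// Look up a request. `None` means no row was found, which a client
    /// could misread as "clean".
    fn lookup(&self, requested: &str) -> Option<bool> {
        self.rows.get(&pypi_wire_name(requested)).copied()
    }
}
```

In this model a request for `Zope.Interface` finds the row stored under `zope.interface`, while a request that was first collapsed to `zope-interface` finds nothing, which is the false-clean result the commit describes.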
…coding, clippy --all-targets

- Reject contradictory verdicts in both directions (is_vulnerable must agree
  with matches presence); false+non-empty was the dangerous false-negative.
- VulnMatch.tier: u8 -> Option<u8> so a null/missing tier no longer fails the
  whole response (server emits row.tier unclamped on /check).
- encode_npm_name percent-encodes every component (scoped + unscoped); output
  is identical for valid names, robust against reserved chars (matches pypi).
- Correct two factually-wrong comments (npm casing; 'staging spells PyPI')
  against the real server (worker.js echoes ecosystem/name/version verbatim).
- source() -> &'static str and validated_base() -> &str: drop per-request
  clone / per-call alloc on the per-package hot path.
- Harness strict clippy now --all-targets (lints tests + the test-stub module);
  cleared the two doc_lazy_continuation warnings it surfaced.
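
The verdict-consistency rule from the first two bullets can be sketched like this. The type and field names other than `VulnMatch`, `tier`, and `is_vulnerable` are hypothetical, JSON decoding is left out, and the malware and unknown verdicts from the contract are not modeled.

```rust
/// One advisory match. `tier` is optional so a null or missing value does
/// not fail the whole response. `id` is a hypothetical field for illustration.
#[derive(Debug, Clone, PartialEq)]
struct VulnMatch {
    id: String,
    tier: Option<u8>,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Clean,
    Vulnerable(Vec<VulnMatch>),
}

#[derive(Debug, PartialEq)]
enum ContractError {
    /// `is_vulnerable` was true but no matches were supplied.
    VulnerableWithoutMatches,
    /// `is_vulnerable` was false but matches were present: the dangerous
    /// false-negative case.
    CleanWithMatches,
}

/// Accept a response only when `is_vulnerable` agrees with the presence of
/// matches; reject contradictions in both directions.
fn check_verdict(is_vulnerable: bool, matches: Vec<VulnMatch>) -> Result<Verdict, ContractError> {
    match (is_vulnerable, matches.is_empty()) {
        (true, false) => Ok(Verdict::Vulnerable(matches)),
        (false, true) => Ok(Verdict::Clean),
        (true, true) => Err(ContractError::VulnerableWithoutMatches),
        (false, false) => Err(ContractError::CleanWithMatches),
    }
}
```

Treating both contradictions as errors lets the gate fail closed instead of trusting whichever field happens to be read first.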
Comment thread .github/workflows/staging-contract.yml Outdated
Comment thread src/utils/api.rs
std::env::var("CORGEA_SOURCE").unwrap_or_else(|_| "cli".to_string())
fn get_source() -> &'static str {
// One definition of the CORGEA-SOURCE value (cached there).
corgea::vuln_api::source()

Member

Instead of creating a separate client, it's better to refactor this one to optionally include authorisation headers, so that any debugging logic and error handling is centralised in one place.

Contributor Author

Agreed that the debug logging can be centralized; I'll pull that into a shared helper as a follow-up. I'd keep the clients separate, though: the vuln-api host is user-configurable, and the shared client is auth-, cookie-, and redirect-enabled by construction (cookies and redirects are fixed at build time, so optional auth alone won't cover it). Merging would add footguns to the auth-bearing client to save ~12 lines of setup.

@Ibrahimrahhal Ibrahimrahhal left a comment

Member

How are we going to handle the auth experience for this new API? Are we going to have two login commands?

Comment thread .github/workflows/staging-contract.yml
@juangaitanv

Contributor Author

How are we going to handle the auth experience for this new API? Are we going to have two login commands?

No auth is required for now, to reduce adoption friction.

Record why the vuln-api client is deliberately separate from the shared
CLI client (user-configurable host must not replay Corgea credentials),
point at the enforcing test, and drop the play-by-play of utils::api
internals so the comment can't go stale when that module changes.

3 participants